Feature selection for automated speech scoring
Abstract
Automated scoring systems used to evaluate spoken or written responses in language assessments need to balance good empirical performance with the interpretability of the scoring models. We compare several methods of feature selection for such scoring systems and show that shrinkage methods such as Lasso regression make it possible to rapidly build models that satisfy the requirements of validity and interpretability, which are crucial in assessment contexts, while also achieving good empirical performance.
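As a rough illustration of how a shrinkage method can act as a feature selector, the sketch below fits a Lasso model by cyclic coordinate descent on synthetic data and keeps the features whose weights are not shrunk to exactly zero. It is not the paper's pipeline: the data, the feature names `f0` to `f5`, the penalty `alpha = 0.2`, and the helper functions are all assumptions made for this example.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / 2n) * ||y - Xw||^2 + alpha * ||w||_1.

    X is a list of rows with centred columns; y is a centred list of targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic data (an assumption for illustration): six candidate features,
# of which only f0 and f3 actually drive the simulated score.
random.seed(0)
feature_names = ["f0", "f1", "f2", "f3", "f4", "f5"]
n_responses = 200
raw = [[random.gauss(0, 1) for _ in feature_names] for _ in range(n_responses)]
scores = [2.0 * row[0] - 1.5 * row[3] + random.gauss(0, 0.1) for row in raw]

# Centre every column and the target.
columns = [center([row[j] for row in raw]) for j in range(len(feature_names))]
X = [[columns[j][i] for j in range(len(feature_names))] for i in range(n_responses)]
y = center(scores)

weights = lasso_coordinate_descent(X, y, alpha=0.2)

# Features with a non-zero weight are the ones the model keeps.
selected = [name for name, w in zip(feature_names, weights) if w != 0.0]
print("weights:", [round(w, 3) for w in weights])
print("selected features:", selected)
```

Because the L1 penalty sets weak weights to exactly zero, the fitted model is sparse, and a small set of retained features is easier to inspect against a scoring rubric. In practice the penalty strength would be tuned (for example by cross-validation) and an established library implementation would normally be preferred to hand-written coordinate descent.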
Similar resources
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...
Modeling Discourse Coherence for the Automated Scoring of Spontaneous Spoken Responses
This study describes an approach for modeling the discourse coherence of spontaneous spoken responses in the context of automated assessment of non-native speech. Although the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spontaneous spoken language, little prior research has been done to assess a speaker’s coherence in the context of a...
Using an Ontology for Improved Automated Content Scoring of Spontaneous Non-Native Speech
This paper presents an exploration into automated content scoring of non-native spontaneous speech using ontology-based information to enhance a vector space approach. We use content vector analysis as a baseline and evaluate the correlations between human rater proficiency scores and two cosine-similarity-based features, previously used in the context of automated essay scoring. We use two ont...
Using Ontology-based Approaches to Representing Speech Transcripts for Automated Speech Scoring
This paper presents a thesis proposal on approaches to automatically scoring non-native speech from second language tests. Current speech scoring systems assess speech primarily by using acoustic features such as fluency and pronunciation; content features, however, are barely involved. Motivated by this limitation, the study aims to investigate the use of content features in speech scoring syste...
Hill-climbing feature selection for multi-stream ASR
We performed automated feature selection for multi-stream (i.e., ensemble) automatic speech recognition, using a hill-climbing (HC) algorithm that changes one feature at a time if the change improves a performance score. For both clean and noisy data sets (using the OGI Numbers corpus), HC usually improved performance on held-out data compared to the initial system it started with, even for nois...